4 research outputs found

Accelerating pairwise sequence alignment on GPUs using the Wavefront Algorithm

Author: Aguado Puig Quim
Publication venue: Universitat Politècnica de Catalunya
Publication date: 19/10/2022

Advances in genomics and sequencing technologies demand faster and more scalable analysis methods that can process longer sequences with higher accuracy. However, classical pairwise alignment methods based on dynamic programming (DP) impose impractical computational requirements when aligning long and noisy sequences like those produced by PacBio and Nanopore technologies. The recently proposed Wavefront Alignment (WFA) algorithm paves the way for more efficient alignment tools, improving time and memory complexity over previous methods. Notwithstanding the advantages of the WFA algorithm, modern high-performance computing (HPC) platforms rely on accelerator-based architectures that exploit parallel computing resources to improve over classical CPUs. Hence, a GPU-enabled implementation of the WFA could exploit the hardware resources of modern GPUs and further accelerate sequence alignment in current genome analysis pipelines. This thesis presents two GPU-accelerated implementations based on the WFA for fast pairwise DNA sequence alignment: eWFA-GPU and WFA-GPU. Our first proposal, eWFA-GPU, computes the exact edit-distance alignment between two short sequences (up to a few thousand bases), taking full advantage of the massively parallel capabilities of modern GPUs. We propose a succinct representation of the alignment data that reduces the overall amount of memory required, allowing the exploitation of the fast on-chip memory of a GPU. Our results show that eWFA-GPU outperforms the edit-distance WFA implementation running on a 20-core machine by 3-9X. Compared to other state-of-the-art tools computing the edit distance, eWFA-GPU is up to 265X faster than CPU tools and up to 56X faster than other GPU-enabled implementations.
Our second contribution, the WFA-GPU tool, extends the work of eWFA-GPU to compute the exact gap-affine distance (i.e., a more general alignment problem) between arbitrarily long sequences. In this work, we propose a CPU-GPU co-design capable of performing inter- and intra-sequence parallel alignment of multiple sequences, combining a succinct WFA data representation with an efficient GPU implementation. As a result, we demonstrate that our implementation outperforms the original WFA implementation by 1.5-7.7X when computing the alignment path, and by 2.6-16X when computing only the alignment score. Moreover, compared to other state-of-the-art tools, WFA-GPU is up to 26.7X faster than other GPU implementations and up to four orders of magnitude faster than other CPU implementations.
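The edit-distance wavefront idea named in this abstract can be illustrated with a small CPU-only sketch. It is not the eWFA-GPU code: the function name `wfa_edit_distance`, the dictionary-based wavefront, and the unit-cost scoring are assumptions chosen for readability, and the succinct data representation and GPU parallelism described above are not reproduced.

```python
def wfa_edit_distance(pattern: str, text: str) -> int:
    """Exact edit distance with unit-cost substitutions, insertions, deletions.

    Illustrative sketch only: for each score s it keeps, per diagonal
    k = j - i, the furthest text offset j reachable with s edits.
    """
    m, n = len(pattern), len(text)
    final_diagonal = n - m          # diagonal that contains the cell (m, n)
    wavefront = {0: 0}              # diagonal -> furthest text offset
    score = 0
    while True:
        # Extend step: slide along each diagonal while characters match.
        for k in list(wavefront):
            j = wavefront[k]
            i = j - k
            while i < m and j < n and pattern[i] == text[j]:
                i += 1
                j += 1
            wavefront[k] = j
        if wavefront.get(final_diagonal) == n:
            return score
        # Next step: derive the wavefront for score + 1 from its neighbours.
        next_wavefront = {}
        for k in range(max(-m, -(score + 1)), min(n, score + 1) + 1):
            candidates = []
            # Substitution: stay on diagonal k, consume one char of each.
            if k in wavefront and wavefront[k] < n and wavefront[k] - k < m:
                candidates.append(wavefront[k] + 1)
            # Insertion: come from diagonal k - 1, consume one text char.
            if k - 1 in wavefront and wavefront[k - 1] < n:
                candidates.append(wavefront[k - 1] + 1)
            # Deletion: come from diagonal k + 1, consume one pattern char.
            if k + 1 in wavefront and wavefront[k + 1] - (k + 1) < m:
                candidates.append(wavefront[k + 1])
            if candidates:
                next_wavefront[k] = max(candidates)
        wavefront = next_wavefront
        score += 1


print(wfa_edit_distance("kitten", "sitting"))  # prints 3
```

Because each wavefront holds one offset per diagonal and only the previous wavefront is needed to compute the score, the working set grows with the alignment score rather than with the product of the sequence lengths, which is the property the abstract relies on for similar sequences.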

UPCommons. Portal del coneixement obert de la UPC

Deployment of a HPC operational service

Author: Aguado Puig Quim
Publication venue: Universitat Autònoma de Barcelona. Escola d'Enginyeria
Publication date: 11/02/2019

Atmospheric dispersion of volcanic ash creates serious economic and safety problems for the aviation industry. Past volcanic eruptions have shown that it is difficult to obtain precise forecasts of volcanic ash clouds. This project presents the design and implementation of a component for controlling remote HPC jobs (in this case, volcanic ash dispersion simulations) in a real-time operational system.

Diposit Digital de Documents de la UAB

Accelerating edit-distance sequence alignment on GPU using the wavefront algorithm

Authors: Aguado Puig Quim; Castells Rufas David; Espinosa Morales Antonio; Marco-Sola Santiago; Moreto Planas Miquel; Moure López Juan Carlos; Álvarez Martí Lluc
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 10/06/2022

Sequence alignment remains a fundamental problem with practical applications ranging from pattern recognition to computational biology. Traditional algorithms based on dynamic programming are hard to parallelize, require significant amounts of memory, and fail to scale for large inputs. This work presents eWFA-GPU, a GPU (graphics processing unit)-accelerated tool to compute the exact edit-distance sequence alignment based on the wavefront alignment algorithm (WFA). This approach exploits the similarities between the input sequences to accelerate the alignment process while requiring less memory than other algorithms. Our implementation takes full advantage of the massively parallel capabilities of modern GPUs. In addition, we propose a succinct representation of the alignment data that reduces the overall amount of memory required, allowing the exploitation of the fast shared memory of a GPU. Our results show that our GPU implementation outperforms the baseline edit-distance WFA implementation running on a 20-core machine by 3-9×. As a result, eWFA-GPU is up to 265 times faster than state-of-the-art CPU implementations and up to 56 times faster than state-of-the-art GPU implementations.

This work was supported in part by the European Union's Horizon 2020 Framework Program through the DeepHealth Project under Grant 825111; in part by the European Union Regional Development Fund within the Framework of the European Regional Development Fund (ERDF) Operational Program of Catalonia 2014–2020 with a Grant of 50% of Total Cost Eligible through the Designing RISC-V-based Accelerators for next-generation Computers Project under Grant 001-P-001723; in part by the Ministerio de Ciencia e Innovacion (MCIN) Agencia Estatal de Investigación (AEI)/10.13039/501100011033 under Contract PID2020-113614RB-C21 and Contract TIN2015-65316-P; and in part by the Generalitat de Catalunya (GenCat)-Departament de Recerca i Universitats (DIUiE) (GRR) under Contract 2017-SGR-313, Contract 2017-SGR-1328, and Contract 2017-SGR-1414. The work of Miquel Moreto was supported in part by the Spanish Ministry of Economy, Industry and Competitiveness under Ramon y Cajal Fellowship under Grant RYC-2016-21104.

Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

OpenCL-based FPGA accelerator for semi-global approximate string matching using diagonal bit-vectors

Authors: Aguado Puig Quim; Alvarez Martí Lluc; Castells Rufas David; Espinosa Morales Antonio; Marco-Sola Santiago; Moreto Planas Miquel; Moure López Juan Carlos
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2021

An FPGA accelerator for the computation of the semi-global Levenshtein distance between a pattern and a reference text is presented. The accelerator helps reduce the execution time of read mappers used in short-read genomic sequencing. Previous attempts to solve the same problem on FPGAs use the Myers algorithm, following a column approach to compute the dynamic programming table. We use an approach based on diagonals that allows for some resource savings while maintaining a very high throughput of 1 alignment per clock cycle. The design is implemented in OpenCL and tested on two FPGA accelerators. The maximum performance obtained is 91.5 MPairs/s for 100 × 120 sequences and 47 MPairs/s for 300 × 360 sequences, the highest ever reported for this problem.

This research was supported by the EU Regional Development Fund under the DRAC project [001-P-001723], by the MINECO-Spain (contract TIN2017-84553-C2-1-R), by the MICIU-Spain (contract RTI2018-095209-B-C22) and by the Catalan government (contracts 2017-SGR-1624, 2017-SGR-313, 2017-SGR-1328). M.M. was partially supported by the MINECO under RYC-2016-21104. We thank Intel for granting us access to the DevCloud system and letting us join the HARP research program. The presented HARP-2 results were obtained on resources hosted at the Paderborn Center for Parallel Computing (PC2) in the Intel Hardware Accelerator Research Program (HARP2).

Peer Reviewed. Postprint (author's final draft).
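For readers unfamiliar with the quantity this accelerator computes, the semi-global Levenshtein distance can be sketched as a plain dynamic programming reference. This is not the diagonal bit-vector design from the paper, nor the Myers algorithm it mentions. The function name `semiglobal_levenshtein` and the two-row layout are assumptions made for illustration.

```python
def semiglobal_levenshtein(pattern: str, text: str) -> int:
    """Minimum edit distance between `pattern` and any substring of `text`.

    Reference sketch only: the pattern must be matched completely, while
    the start and end positions in the text are free of charge.
    """
    n = len(text)
    # Row 0 is all zeros: an alignment may start at any text position.
    prev = [0] * (n + 1)
    for i in range(1, len(pattern) + 1):
        cur = [i] + [0] * n          # column 0: i pattern chars deleted
        for j in range(1, n + 1):
            mismatch = pattern[i - 1] != text[j - 1]
            cur[j] = min(
                prev[j] + 1,             # skip a pattern character
                cur[j - 1] + 1,          # skip a text character
                prev[j - 1] + mismatch,  # match or substitution
            )
        prev = cur
    # The alignment may end at any text position, so take the row minimum.
    return min(prev)


print(semiglobal_levenshtein("abd", "xxabcxx"))  # prints 1
```

The only differences from the global edit distance are the zero-initialised first row and the minimum taken over the last row; every cell still depends on its left, upper, and upper-left neighbours, which is the dependency pattern that column-wise and diagonal-wise hardware schedules traverse in different orders.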

UPCommons. Portal del coneixement obert de la UPC